Quad-Byte Transformation using Zero-frequency Bytes
Authors
Abstract
The byte pair encoding (BPE) algorithm, suggested by P. Gage, achieves data compression by encoding all instances of the most frequent byte pair in the source data with a zero-frequency byte, that is, a byte value that does not occur in the data. This process is repeated for up to m passes, the maximum possible, until no further compression is possible, either because there are no more frequently occurring byte pairs or because there are no more unused zero-frequency bytes to represent pairs. The proposed quad-byte transformation algorithm encodes 4-byte integers instead of byte pairs. The proposed QBT-Z (Quad-Byte Transformation using Zero-frequency bytes) is applied to the data blocks of a file. A block-wise implementation of QBT-Z with a varying number of passes is evaluated on various test files. The aim is to minimize compression time and achieve a better compression rate. QBT-Z with m passes transforms only one quad-byte per pass using an unused zero-frequency byte, and the process is repeated up to the maximum possible m times. k-pass QBT-Z, with k << m, transforms half of the possible most frequent quad-bytes in each pass except the last; in the last pass, it transforms all remaining possible quad-bytes. The algorithm is tested on 18 files of various types with a total size of nearly 40 MB. The overall transformation rate (the compression rate achieved by the transformation) is 17.04%, 18.91%, 19.25%, 19.31% and 22.75% using QBT-Z with 1, 2, 3, 4 and m passes respectively, showing that a larger number of passes yields a greater reduction in transformed file size. The total transformation time taken by k-pass QBT-Z with 1, 2, 3 and 4 passes is only 7.629, 12.183, 16.091 and 20.994 seconds respectively, far less than the 268.434 seconds taken by m-pass QBT-Z. As the number of passes increases, compression improves at the cost of increased execution time. The reverse transformation is very fast, taking only 2 to 4 seconds.
Keywords: data compression, transformation, quad-byte encoding, substitution using zero-frequency bytes
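The substitution idea described in the abstract can be illustrated with a minimal single-pass sketch in Python. This is not the authors' implementation: the function names `qbt_z_single_pass` and `qbt_z_inverse`, the overlapping-window frequency count, the greedy left-to-right replacement, and the in-memory substitution table (in place of a block header) are all assumptions made for illustration.

```python
from collections import Counter


def qbt_z_single_pass(block: bytes):
    """Hypothetical one-pass sketch: replace frequent quad-bytes with unused byte values."""
    used = set(block)
    # Zero-frequency bytes: values that never occur in this block.
    free = [b for b in range(256) if b not in used]

    # Count 4-byte sequences at every offset (overlapping windows; an assumption).
    counts = Counter(block[i:i + 4] for i in range(len(block) - 3))

    # Keep at most len(free) of the most frequent quad-bytes that repeat at least once.
    candidates = [quad for quad, n in counts.most_common(len(free)) if n > 1]
    table = dict(zip(candidates, free))  # quad-byte -> substitute byte

    out = bytearray()
    i = 0
    while i < len(block):
        code = table.get(block[i:i + 4])
        if code is not None:
            out.append(code)      # 4 bytes shrink to 1
            i += 4
        else:
            out.append(block[i])  # literal byte copied unchanged
            i += 1
    # The reverse table would have to be stored with the block to allow decoding.
    return bytes(out), {code: quad for quad, code in table.items()}


def qbt_z_inverse(data: bytes, reverse_table: dict) -> bytes:
    """Expand each substitute byte back into its quad-byte; copy other bytes as-is."""
    out = bytearray()
    for b in data:
        out += reverse_table.get(b, bytes([b]))
    return bytes(out)


block = b"abcdabcdabcdxyz"
encoded, reverse_table = qbt_z_single_pass(block)
restored = qbt_z_inverse(encoded, reverse_table)
```

Because every substitute byte is absent from the original block, the reverse step needs only a table lookup per byte, which is consistent with the short reverse-transformation times reported in the abstract. A multi-pass variant would re-run the forward step on its own output while unused byte values remain.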
Similar resources
Byte Pair Transformation using Zero-Frequency Bytes with Varying Number of Passes
The byte pair encoding (BPE) algorithm, suggested by P. Gage, achieves data compression. It encodes all instances of the most frequent byte pair using a zero-frequency byte in the source data. This process is repeated for up to m passes, the maximum possible, until no further compression is possible, either because there are no more frequently occurring byte pairs or there are no more unused zero-fr...
Some results on RC4 in WPA
Motivated by the work of AlFardan et al 2013, in this paper we present several results related to RC4 non-randomness in WPA. We first prove the interesting zig-zag distribution of the first byte and the similar nature for the biases in the initial keystream bytes to zero. As we note, this zig-zag nature surfaces due to the dependency of first and second key bytes in WPA/TKIP, both derived from ...
Dependence in IV-Related Bytes of RC4 Key Enhances Vulnerabilities in WPA
The first three bytes of the RC4 key in WPA are public as they are derived from the public parameter IV, and this derivation leads to a strong mutual dependence between the first two bytes of the RC4 key. In this paper, we provide a disciplined study of RC4 biases resulting specifically in such a scenario. Motivated by the work of AlFardan et al. (2013), we first prove the interesting sawtooth ...
A Collision Attack on a Double-Block-Length Compression Function Instantiated with 8-/9-Round AES-256
$f_0(h_0 \| h_1, M) = E_{h_1 \| M}(h_0) \oplus h_0$, $f_1(h_0 \| h_1, M) = E_{h_1 \| M}(h_0 \oplus c) \oplus h_0 \oplus c$, where $\|$ represents concatenation, $E$ is AES-256 and $c$ is a 16-byte nonzero constant. The proposed attack is a free-start collision attack using the rebound attack proposed by Mendel et al. The success of the proposed attack largely depends on the configuration of the constant $c$: the number of its non-zero bytes and their posit...
Attack on Broadcast RC4 Revisited
In this paper, contrary to the claim of Mantin and Shamir (FSE 2001), we prove that there exist biases in the initial bytes (3 to 255) of the RC4 keystream towards zero. These biases immediately provide distinguishers for RC4. Additionally, the attack on broadcast RC4 to recover the second byte of the plaintext can be extended to recover the bytes 3 to 255 of the plaintext given Ω(N) many ciphe...
Publication date: 2014